1 First steps on PhytoBacExplorer

1.1 Register!

Register on PhytoBacExplorer (https://phytobacexplorer.warwick.ac.uk/).

This opens the registration dialogue.

1.2 Explore the landing page

After successful registration, we explore some of the most important aspects of the landing page:

2 Interactive exercise: download target and neighbour assemblies for diagnostic primer generation for Xanthomonas vasicola pv. vasculorum

2.1 Define and load targets and neighbours (=off-targets)

As an example, we will load Xanthomonas vasicola (Xv) assemblies. You can choose any other target and neighbour (= off-target) species.

In the example, our target is the pathovar vasculorum. Our neighbours are all other Xv assemblies on PhytoBacExplorer that are not vasculorum.

2.2 Open the “Search Strains” dialogue

Other options for Field in the Strain Metadata tab include:

  • Uberstrain: Some records are duplicates, with many entries for what is essentially the same strain. This can skew an analysis by producing false clusters that are in reality just the same strain. In EnteroBase, such entries are therefore merged into a single Uberstrain.
  • Name
  • Species
  • Pathvar
  • Race
  • Barcode: the strain barcode. This is not the assembly barcode that is used to name the downloaded assemblies.
  • Data Source: Accession No. etc.
  • Project: Bio Project ID etc.

Other Operator options include:

  • contains: all entries where the Field contains the Value
  • in: clicking into the Value field opens a dialogue that allows you to insert a list of values separated by commas or whitespace.
  • not contains: all entries where the Field does not contain the Value
  • equals: all entries where Field is exactly what is defined in Value
  • not equals: all entries where Field is not exactly what is defined in Value

Below that, the AND button allows you to add further criteria on the same or on different Fields, each with its own Operator (contains, not contains, etc.) and Value. Returned entries have to fulfil all defined criteria.

This means that when you search for two different species using AND, no entries will be returned, because no entry can be two different species at the same time.

Alternatively, you can use the OR button, which returns any entry that fulfils at least one of the defined criteria.

For instance, you can search for entries that are fragariae or campestris, which will return all assemblies that are fragariae and all that are campestris.
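The search logic described above can be sketched in a few lines of Python. This is only an illustration of how the operators and the AND/OR combination behave, not PhytoBacExplorer's actual implementation: the records, the function names `matches` and `search`, and the case-sensitive comparison are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Criterion = Tuple[str, str, object]  # (Field, Operator, Value)

# Each operator compares a record's field content with the search value.
# Comparisons are case-sensitive here, which is an assumption.
OPERATORS: Dict[str, Callable[[str, object], bool]] = {
    "contains": lambda field, value: value in field,
    "not contains": lambda field, value: value not in field,
    "equals": lambda field, value: field == value,
    "not equals": lambda field, value: field != value,
    "in": lambda field, value: field in value,  # value is a list of strings
}


def matches(record: Record, criterion: Criterion) -> bool:
    """Check a single (Field, Operator, Value) criterion against one record."""
    field, operator, value = criterion
    return OPERATORS[operator](record.get(field, ""), value)


def search(records: List[Record], criteria: List[Criterion],
           combine: str = "AND") -> List[Record]:
    """AND keeps records meeting all criteria, OR those meeting at least one."""
    test = all if combine == "AND" else any
    return [r for r in records if test(matches(r, c) for c in criteria)]


# Made-up example records.
records = [
    {"Name": "strain_A", "Species": "fragariae"},
    {"Name": "strain_B", "Species": "campestris"},
    {"Name": "strain_C", "Species": "vasicola"},
]

criteria = [
    ("Species", "equals", "fragariae"),
    ("Species", "equals", "campestris"),
]

and_hits = search(records, criteria, "AND")
or_hits = search(records, criteria, "OR")

print(len(and_hits))                  # 0
print([r["Name"] for r in or_hits])   # ['strain_A', 'strain_B']
```

With AND, no record can equal both species, so the result is empty. With OR, both the fragariae and the campestris record are returned.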

2.3 Select assemblies and generate GrapeTree

We first select all assemblies:

Then, in the top right corner, we change Experimental Data to “Almeida_6_gene_MLST” and click the GrapeTree symbol:

The Create GrapeTree dialogue opens. Here we can give the tree a name and choose between minimum spanning tree algorithms and neighbour-joining (NJ) algorithms. Both generate trees based on MLST allelic profiles (see footnote 1). Since we do not have missing data (all entries have defined STs), we chose a simple NJ algorithm (Ninja NJ).
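To make "based on MLST allelic profiles" more concrete, here is a minimal sketch of the kind of pairwise comparison that underlies such trees: counting the loci at which two profiles carry different alleles. The profiles and ST names below are invented, and this is not GrapeTree's own code; the actual algorithms differ in how they build the tree and handle missing data.

```python
from itertools import combinations

# Hypothetical 6-locus allelic profiles (allele numbers are made up).
profiles = {
    "ST_a": [1, 1, 2, 3, 1, 4],
    "ST_b": [1, 1, 2, 3, 2, 4],
    "ST_c": [5, 2, 2, 7, 2, 4],
}


def allelic_distance(p, q):
    """Number of loci at which two profiles carry different alleles."""
    return sum(a != b for a, b in zip(p, q))


# Pairwise distances between all profiles.
distances = {
    (s, t): allelic_distance(profiles[s], profiles[t])
    for s, t in combinations(profiles, 2)
}

for pair, d in distances.items():
    print(pair, d)
# ('ST_a', 'ST_b') 1
# ('ST_a', 'ST_c') 4
# ('ST_b', 'ST_c') 3
```

Profiles that differ at few loci end up close together in the tree, and profiles that differ at many loci end up far apart.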

Suggestion: if you want to, try out multiple different algorithms and compare topology.

2.4 Annotate your GrapeTree

We can annotate our tree by clicking on the legend and choosing which metadata we want to colour it by:

Suggestion: try colouring by other metadata, like country, too.



We can also annotate the nodes and show their labels. We are interested in the STs:

2.5 Extracting the relevant STs of our target pathovar

Our target STs in the example are 67, 138 and 399.

We can press Shift and click on the nodes to select them. Then we can click Load Selected in the left panel.

We then see the metadata of all the strains.

We can then right-click, choose select all again, and download the selected target assemblies for further downstream processing.
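If you also keep the strain metadata as a table, the same target/neighbour split can be reproduced offline. The sketch below assumes a list of rows with hypothetical keys `assembly` and `ST`; the assembly names and the non-target ST are invented, and only the three target STs come from the example.

```python
TARGET_STS = {67, 138, 399}  # target STs from the example

# Hypothetical metadata rows; names and the non-target ST are made up.
rows = [
    {"assembly": "ASM_0001", "ST": 67},
    {"assembly": "ASM_0002", "ST": 138},
    {"assembly": "ASM_0003", "ST": 12},
    {"assembly": "ASM_0004", "ST": 399},
]

# Targets carry one of the target STs; everything else is a neighbour.
targets = [r["assembly"] for r in rows if r["ST"] in TARGET_STS]
neighbours = [r["assembly"] for r in rows if r["ST"] not in TARGET_STS]

print(targets)     # ['ASM_0001', 'ASM_0002', 'ASM_0004']
print(neighbours)  # ['ASM_0003']
```

The two lists can then be used to sort the downloaded assembly files into a target and a neighbour folder for primer design.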


3 Further Reading:

3.2 on MLSTs:

4 Questions?


  1. Compare https://enterobase.readthedocs.io/en/latest/grapetree/tutorial-2.html?highlight=grape%20tree and Z Zhou, NF Alikhan, MJ Sergeant, N Luhmann, C Vaz, AP Francisco, JA Carrico, M Achtman (2018) “GrapeTree: Visualization of core genomic relationships among 100,000 bacterial pathogens”, Genome Res. doi: https://doi.org/10.1101/gr.232397.117